A Path Signature Approach for Speech Emotion Recognition
Abstract
Automatic speech emotion recognition (SER) remains a difficult task within human-computer interaction, despite increasing interest in the research community.
One key challenge is how to effectively integrate short-term characterisation of speech segments with long-term information such as temporal variations.
Motivated by the numerical approximation theory of stochastic differential equations (SDEs), we propose the novel use of path signatures, which provide a pathwise definition of solutions to SDEs, for the integration of short speech frames.
Furthermore, we propose a hierarchical tree structure of path signatures to capture both global and local information.
A simple tree-based convolutional neural network (TBCNN) is used for learning the structural information stemming from dyadic path-tree signatures.
Our experimental results on a widely used benchmark dataset demonstrate performance comparable to that of complex neural-network-based systems.
Index Terms: speech emotion recognition, path signature feature, convolutional neural network
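As a rough illustration of the two ideas named in the abstract, the sketch below computes a path signature truncated at depth 2 for a piecewise-linear path, then arranges signatures of dyadic sub-paths into levels of a tree. It is not the authors' implementation. The function names `signature_depth2` and `dyadic_signature_tree`, the truncation depth, and the way the path is split are assumptions made for this example, and the toy 2-D path stands in for a sequence of frame-level speech features.

```python
def signature_depth2(path):
    """Depth-2 truncated signature of a piecewise-linear path.

    path: list of points, each a list of d floats (hypothetical input format).
    Returns (level1, level2): the d increments and the d x d iterated integrals.
    """
    d = len(path[0])
    level1 = [0.0] * d
    level2 = [[0.0] * d for _ in range(d)]
    for prev, curr in zip(path, path[1:]):
        delta = [c - p for p, c in zip(prev, curr)]
        # Chen's identity for appending one linear segment:
        # S2 <- S2 + S1 (x) delta + 0.5 * delta (x) delta, using the old S1.
        for i in range(d):
            for j in range(d):
                level2[i][j] += level1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(d):
            level1[i] += delta[i]
    return level1, level2


def dyadic_signature_tree(path, levels):
    """Signatures of dyadic sub-paths, one list of nodes per tree level.

    Level k holds the signatures of 2**k consecutive sub-paths, so level 0
    summarises the whole path and deeper levels capture more local detail.
    Assumption: sub-paths are cut at evenly spaced segment indices.
    """
    n_steps = len(path) - 1
    tree = []
    for k in range(levels + 1):
        pieces = 2 ** k
        nodes = []
        for m in range(pieces):
            start = m * n_steps // pieces
            end = (m + 1) * n_steps // pieces
            nodes.append(signature_depth2(path[start:end + 1]))
        tree.append(nodes)
    return tree


# Toy 2-D path: one step right, then one step up.
toy_path = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
level1, level2 = signature_depth2(toy_path)
tree = dyadic_signature_tree(toy_path, levels=1)
```

For the toy path, the first level is the total increment and the antisymmetric part of the second level, 0.5 * (level2[0][1] - level2[1][0]), is the signed area between the path and its chord. In a tree-structured model of the kind the abstract describes, each node's signature would serve as that node's feature vector; how the authors build the path from speech frames is not specified in this record.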
Citations
Bo Wang, Maria Liakata, Hao Ni, Terry Lyons, Alejo J Nevado-Holgado, Kate Saunders. A Path Signature Approach for Speech Emotion Recognition. INTERSPEECH 2019, September 15–19, 2019.
Sponsorship: Supported by the NIHR
DOI: http://dx.doi.org/10.21437/Interspeech.2019-2624
URI: https://www.oxfordhealth.nhs.uk/orka/title/a-path-signature-approach-for-speech-emotion-recognition/
Page last reviewed: 12 June, 2025
Metadata
Author(s): External author(s) only
Collection: 123456789/37
Subject(s): Emotions, Mental Disorders, Speech Emotion Recognition
Format(s): Article
Date issued: 2019-09
ISSN: 1990-9772
ID: 297